Data Quality Management in a Database Cluster with Lazy Replication
Abstract
We consider the use of a database cluster with lazy replication. In this context, controlling the quality of replicated data according to users’ requirements is important for improving performance. However, existing approaches are each limited to a particular aspect of data quality. In this paper, we propose a general model of data quality that distinguishes between the “freshness” and the “validity” of data. Data quality is expressed through measures of divergence from data of perfect quality. Users can thus specify the minimum level of quality required for their queries, and this information can be exploited to optimize query load balancing. We implemented our approach in our Refresco prototype. The results show that freshness control can help increase query throughput significantly. They also show a significant improvement when freshness requirements are specified at the relation level rather than at the database level.
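To make the idea of exploiting quality requirements for load balancing concrete, here is a minimal sketch in Python. It is not the Refresco algorithm, which the abstract does not describe. It assumes, purely for illustration, that divergence is measured as a count of updates not yet applied to a replica and that load is a count of running queries. The names `Replica`, `staleness`, `load`, and `route_query` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Replica:
    name: str
    staleness: int  # assumed divergence measure: updates not yet applied
    load: int       # assumed load measure: queries currently running


def route_query(replicas: Sequence[Replica], max_staleness: int) -> Optional[Replica]:
    """Pick the least-loaded replica whose staleness is within the tolerance.

    Returns None when no replica is fresh enough; a real system would
    then have to refresh a replica before running the query.
    """
    candidates = [r for r in replicas if r.staleness <= max_staleness]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.load)


# Hypothetical cluster state.
replicas = [
    Replica("n1", staleness=0, load=5),
    Replica("n2", staleness=3, load=1),
    Replica("n3", staleness=10, load=0),
]

strict = route_query(replicas, max_staleness=0)   # only perfectly fresh data
relaxed = route_query(replicas, max_staleness=5)  # tolerates some divergence
```

In this sketch, a query that demands perfectly fresh data can only run on the busy node `n1`, while a query that tolerates a staleness of 5 can be sent to the lightly loaded `n2`. A looser requirement gives the router more candidate replicas to choose from.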
Related papers
Robust Snapshot Replication
An important technique to ensure the scalability and availability of clustered computer systems is data replication. This paper describes a new approach to data replication management called Robust Snapshot Replication. It combines an update anywhere approach (so updates can be evaluated on any replica, spreading their load) with lazy update propagation and snapshot isolation concurrency contro...
Database Replication: If You Must be Lazy, be Consistent
Due to severe performance penalties associated with synchronous replication, there is a significant interest in asynchronous replica management protocols. Lazy protocols currently in use either do not guarantee consistency and serializability as needed by transactional semantics or they impose restrictions on placement of data and which data object can be updated. In this paper we consider an a...
Replica Refresh Strategies in a Database Cluster
Relaxing replica freshness has been exploited in database clusters to optimize load balancing. However, in most approaches, refreshment is typically coupled with other functions such as routing or scheduling, which makes it hard to analyze the impact of the refresh strategy itself on performance. In this paper, we propose to support routing-independent refresh strategies in a database cluster wi...
Fine-grained Refresh Strategies for Managing Replication in Database Clusters
Relaxing replica freshness has been exploited in database clusters to optimize load balancing. In this paper, we propose to support both routing-dependent and routing-independent refresh strategies in a database cluster with multi-master lazy replication. First, we propose a model for capturing refresh strategies. Second, we describe the support of this model in a middleware architecture for fr...
Preventive Multi-master Replication in a Cluster of Autonomous Databases
We consider the use of a cluster of PC servers for Application Service Providers where applications and databases must remain autonomous. We use data replication to improve data availability and query load balancing (and thus performance). However, replicating databases at several nodes can create consistency problems, which need to be managed through special protocols. In this paper, we presen...
Journal: JDIM
Volume: 3
Publication date: 2005